Automatic indexing of PDF documents with ontologies

Authors

  • Anjo Anjewierden
  • Suzanne Kabel
Abstract

Indexing large bodies of data is necessary to enable satisfactory search results. Ontologies serve as fixed vocabularies to index data from different viewpoints. We describe how AIDAS, a software tool, automatically divides the source data (PDF documents) into reusable chunks, how it automatically indexes these chunks and stores them in a database to enable reuse.
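The pipeline named in the abstract (divide into chunks, index against an ontology, store for reuse) can be illustrated with a minimal sketch. This is not the AIDAS implementation: the ontology contents, the blank-line chunking rule, the table layout, and every function name below are assumptions made for illustration, and plain text stands in for extracted PDF content.

```python
import re
import sqlite3

# Hypothetical ontology: each concept maps to terms that signal it.
ONTOLOGY = {
    "engine": ["engine", "motor"],
    "maintenance": ["maintenance", "repair", "inspection"],
}


def split_into_chunks(text):
    """Split on blank lines; a stand-in for real PDF layout analysis."""
    return [c.strip() for c in re.split(r"\n\s*\n", text) if c.strip()]


def index_chunk(chunk, ontology):
    """Return the sorted concepts whose terms occur as whole words in the chunk."""
    words = set(re.findall(r"[a-z]+", chunk.lower()))
    return sorted(c for c, terms in ontology.items() if words & set(terms))


def store(conn, chunks, ontology):
    """Store each chunk once, plus one row per (chunk, concept) index entry."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS chunk (id INTEGER PRIMARY KEY, body TEXT)"
    )
    conn.execute(
        "CREATE TABLE IF NOT EXISTS chunk_concept (chunk_id INTEGER, concept TEXT)"
    )
    for body in chunks:
        cur = conn.execute("INSERT INTO chunk (body) VALUES (?)", (body,))
        for concept in index_chunk(body, ontology):
            conn.execute(
                "INSERT INTO chunk_concept VALUES (?, ?)", (cur.lastrowid, concept)
            )
    conn.commit()


def find_chunks(conn, concept):
    """Retrieve stored chunks indexed under a concept, in insertion order."""
    rows = conn.execute(
        "SELECT body FROM chunk "
        "JOIN chunk_concept ON chunk.id = chunk_concept.chunk_id "
        "WHERE concept = ? ORDER BY chunk.id",
        (concept,),
    )
    return [row[0] for row in rows]
```

Calling `store(sqlite3.connect(":memory:"), split_into_chunks(text), ONTOLOGY)` and then `find_chunks(conn, "engine")` returns the chunks that mention any term listed under that concept. Keeping the index in a separate table lets the same chunk be indexed under several concepts, which matches the idea of indexing from different viewpoints.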

Similar resources

Automatic indexing of documents with ontologies

Automatic Workflow Generation and Modification by Enterprise Ontologies and Documents

This article presents a novel method and development paradigm that proposes a general template for an enterprise information structure and allows for the automatic generation and modification of enterprise workflows. This dynamically integrated workflow development approach utilises a conceptual ontology of domain processes and tasks, enterprise charts, and enterprise entities. It also suggests...


Automatic Multi-label Subject Indexing in a Multilingual Environment

This paper presents an approach to automatically subject index fulltext documents with multiple labels based on binary support vector machines (SVM). The aim was to test the applicability of SVMs with a real world dataset. We have also explored the feasibility of incorporating multilingual background knowledge, as represented in thesauri or ontologies, into our text document representation for ...


Document indexing for automatic semantic annotation support

Nowadays, capturing the knowledge in ontological structures is one of the primary focuses of the knowledge management research. To exploit the knowledge from the vast quantity of existing unstructured texts available in natural languages in ontologies, tools for automatic semantic annotation (ASA) are heavily needed. In this paper, we present an approach to ASA and a method for documents conten...


Publication date: 2001